{
 "nbformat": 4,
 "nbformat_minor": 0,
 "metadata": {
  "colab": {
   "provenance": [],
   "machine_shape": "hm"
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "# Llama Pack - Neo4j Query Engine\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/llama_packs/neo4j_query_engine/llama_packs_neo4j.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "This Llama Pack creates a Neo4j knowledge graph query engine, and executes its `query` function. This pack offers the option of creating multiple types of query engines for Neo4j knowledge graphs, namely:\n",
    "\n",
    "* Knowledge graph vector-based entity retrieval (default if no query engine type option is provided)\n",
    "* Knowledge graph keyword-based entity retrieval\n",
    "* Knowledge graph hybrid entity retrieval\n",
    "* Raw vector index retrieval\n",
    "* Custom combo query engine (vector similarity + KG entity retrieval)\n",
    "* KnowledgeGraphQueryEngine\n",
    "* KnowledgeGraphRAGRetriever\n",
    "\n",
    "For this notebook, we will load a Wikipedia page on paleo diet into Neo4j KG and perform queries."
   ],
   "metadata": {
    "id": "lTR76MnbHlMF"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "!pip install llama_index llama_hub neo4j"
   ],
   "metadata": {
    "id": "gZhtMcUkhCRn"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "import os, openai, logging, sys\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-#######################\"\n",
    "\n",
    "logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)"
   ],
   "metadata": {
    "id": "9WMxhw2Yjpjs"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Setup Data\n",
    "\n",
    "Load a Wikipedia page on paleo diet."
   ],
   "metadata": {
    "id": "y8B2WvxtNs0V"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from llama_index import download_loader\n",
    "\n",
    "WikipediaReader = download_loader(\"WikipediaReader\")\n",
    "loader = WikipediaReader()\n",
    "documents = loader.load_data(pages=[\"Paleolithic diet\"], auto_suggest=False)\n",
    "print(f\"Loaded {len(documents)} documents\")"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "2xC1ThySNsAY",
    "outputId": "a96de42a-ace1-4b36-f3ab-a29d6bd921ad"
   },
   "execution_count": null,
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Loaded 1 documents\n"
     ]
    }
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Download and Initialize Pack"
   ],
   "metadata": {
    "id": "icH9lDT7LAQH"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from llama_index.llama_pack import download_llama_pack\n",
    "\n",
    "# download and install dependencies\n",
    "Neo4jQueryEnginePack = download_llama_pack(\"Neo4jQueryEnginePack\", \"./neo4j_pack\")"
   ],
   "metadata": {
    "id": "-SOCDPS32GM7"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "Assume you have the credentials for Neo4j stored in `credentials.json` at the project root, you load the json and extract the credential details."
   ],
   "metadata": {
    "id": "l1oGbRIPN3RS"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "import json\n",
    "\n",
    "# get Neo4j credentials (assume it's stored in credentials.json)\n",
    "with open(\"credentials.json\") as f:\n",
    "    neo4j_connection_params = json.load(f)\n",
    "    username = neo4j_connection_params[\"username\"]\n",
    "    password = neo4j_connection_params[\"password\"]\n",
    "    url = neo4j_connection_params[\"url\"]\n",
    "    database = neo4j_connection_params[\"database\"]"
   ],
   "metadata": {
    "id": "KO0oa4GJ0_Gx"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "See below how `Neo4jQueryEnginePack` is constructed.  You can pass in the `query_engine_type` from `Neo4jQueryEngineType` to construct `Neo4jQueryEnginePack`. The code snippet below shows a KG keyword query engine.  If `query_engine_type` is not defined, it defaults to KG vector based entity retrieval.\n",
    "\n",
    "`Neo4jQueryEngineType` is an enum, which holds various query engine types, see below. You can pass in any of these query engine types to construct `Neo4jQueryEnginePack`.\n",
    "```\n",
    "class Neo4jQueryEngineType(str, Enum):\n",
    "    \"\"\"Neo4j query engine type\"\"\"\n",
    "\n",
    "    KG_KEYWORD = \"keyword\"\n",
    "    KG_HYBRID = \"hybrid\"\n",
    "    RAW_VECTOR = \"vector\"\n",
    "    RAW_VECTOR_KG_COMBO = \"vector_kg\"\n",
    "    KG_QE = \"KnowledgeGraphQueryEngine\"\n",
    "    KG_RAG_RETRIEVER = \"KnowledgeGraphRAGRetriever\"\n",
    "```"
   ],
   "metadata": {
    "id": "vTUiXf5xLRRO"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from llama_hub.llama_packs.neo4j_query_engine.base import Neo4jQueryEngineType\n",
    "\n",
    "# create the pack\n",
    "neo4j_pack = Neo4jQueryEnginePack(\n",
    "    username=username,\n",
    "    password=password,\n",
    "    url=url,\n",
    "    database=database,\n",
    "    docs=documents,\n",
    "    query_engine_type=Neo4jQueryEngineType.KG_KEYWORD,\n",
    ")"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "MldEbU3xz3Yk",
    "outputId": "062e4b27-dd9c-425c-9d32-b889d2fe55b5"
   },
   "execution_count": null,
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "loaded nodes with 8 nodes\n"
     ]
    }
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Run Pack"
   ],
   "metadata": {
    "id": "2ViEkJinLLiH"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from IPython.display import Markdown\n",
    "\n",
    "response = neo4j_pack.run(\"Tell me about the benefits of paleo diet.\")\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 81
    },
    "id": "mn2DIWuX1XBR",
    "outputId": "67edb07d-2d7c-4c84-da60-01cca8ec7a64"
   },
   "execution_count": null,
   "outputs": [
    {
     "output_type": "display_data",
     "data": {
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ],
      "text/markdown": "<b>The benefits of the paleo diet include a re-imagining of what Paleolithic people ate, a diet that is 65% plant-based, and the forbidding of consumption of all dairy products. The diet is based on the evolutionary discordance hypothesis, which suggests that many chronic diseases and degenerative conditions evident in modern Western populations have arisen because of a mismatch between Stone Age genes and modern lifestyles.</b>"
     },
     "metadata": {}
    }
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Let's try out the KG hybrid query engine. See code below.  You can try any other query engines in a similar way by replacing the `query_engine_type` with another query engine type from `Neo4jQueryEngineType` enum."
   ],
   "metadata": {
    "id": "a8JoNXXwL9_M"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "neo4j_pack = Neo4jQueryEnginePack(\n",
    "    username=username,\n",
    "    password=password,\n",
    "    url=url,\n",
    "    database=database,\n",
    "    docs=documents,\n",
    "    query_engine_type=Neo4jQueryEngineType.KG_HYBRID,\n",
    ")\n",
    "\n",
    "response = neo4j_pack.run(\"Tell me about the benefits of paleo diet.\")\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 133
    },
    "id": "vEzStIYvJa9S",
    "outputId": "95136bdf-68a7-4dd7-912b-ca4cd6d2fb23"
   },
   "execution_count": null,
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "loaded nodes with 8 nodes\n"
     ]
    },
    {
     "output_type": "display_data",
     "data": {
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ],
      "text/markdown": "<b>The paleo diet is believed to have several benefits. It is thought that following this diet may lead to improvements in body composition and metabolism compared to the typical Western diet. The emphasis on whole, unprocessed foods in the paleo diet can also help reduce the intake of added sugars and salt. Additionally, the diet may promote satiety, which can aid in weight loss. However, it is important to note that the paleo diet can lead to nutritional deficiencies, such as inadequate calcium intake, and may increase the risk of ingesting toxins from high fish consumption. The effectiveness of the paleo diet in reducing the risk of cardiovascular disease or treating inflammatory bowel disease is not supported by strong evidence.</b>"
     },
     "metadata": {}
    }
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Comparison of the Knowledge Graph Query Strategies\n",
    "\n",
    "The table below lists the details of the 7 query engines, and their pros and cons based on experiments with NebulaGraph and LlamaIndex, as outlined in the blog post [7 Query Strategies for Navigating Knowledge Graphs with LlamaIndex](https://betterprogramming.pub/7-query-strategies-for-navigating-knowledge-graphs-with-llamaindex-ed551863d416?sk=55c94ad72e75aa52ac6cc21d8145b37d)."
   ],
   "metadata": {
    "id": "XoK0BiKZDMaJ"
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "![Knowledge Graph query strategies comparison](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*0UsLpj7v2GO67U-99YJBfg.png)"
   ],
   "metadata": {
    "id": "DVr0XiFBGMY-"
   }
  }
 ]
}
